Structural Information in Two-Dimensional Patterns: 
Entropy Convergence and Excess Entropy

We develop information-theoretic measures of spatial structure and pattern in more than one
dimension. As is well known, the entropy density of a two-dimensional configuration can be efficiently 
and accurately estimated via a converging sequence of conditional entropies. We show that the 
manner in which these conditional entropies converge to their asymptotic value serves as a measure 
of global correlation and structure for spatial systems in any dimension. We compare and contrast 
entropy-convergence with mutual-information and structure-factor techniques for quantifying and 
detecting spatial structure.

I. INTRODUCTION

The past decade has seen considerable advances in our 
understanding of general ways to detect and quantify pat- 
tern in one-dimensional systems. This work is of intrinsic 
and general interest, since it suggests different ways of 
viewing patterns and calls attention to some of the
subtleties associated with pattern discovery and
quantification, issues that, implicitly or explicitly, underlie
much of the scientific enterprise. 

Recently, these abstract measures of structural com- 
plexity or pattern played a key role in several applica- 
tions in physics and dynamical systems. For example, 
there is a growing body of work that seeks to relate the 
structural complexity of a one-dimensional sequence to 
the difficulty one encounters when trying to learn or
synchronize to the generating process. Also, complexity
measures have recently been used to characterize
experimentally observed structures in a class of layered
materials known as polytypes.

The successes in one dimension have not been readily 
followed by similar advances in two dimensions. Nonethe- 
less, the development of a general measure of complexity 
— or pattern or structure — for two-dimensional sys- 
tems is a longstanding goal. How is information shared, 
stored, and transmitted across a two-dimensional lattice 
to produce a given set of configurations? How can we 
quantitatively distinguish between different types of or- 
dering or pattern in two dimensions? Though largely 
answered in one dimension, these questions are open in 
higher dimensions. 

One oft-used set of techniques for examining patterns is 
Fourier or spectral analysis. This approach is well suited 
to detecting periodic ordering when the wavenumber of 
the transform matches the periodicity of the pattern. 



However, these methods typically rely on two-variable 
correlation functions. As such, they are incapable of dis- 
tinguishing structures that differ in their correlations over 
more than two variables, as we shall see below. 

Some recent work in this area, motivated in part by the
need to characterize complex interfaces in surface science
and geology, has suggested a set
of approaches to these questions that are similar in spirit
to fractal dimensions, in the sense that these approaches
involve coarse-graining variables and then monitoring the
changes that result as the coarse-graining scale is
modulated. One can also use a multifractal approach, also
known as the singularity spectrum $f(\alpha)$, the
thermodynamic formalism, and the fluctuation spectrum; for
reviews, see, e.g., Refs. [...]. These approaches
can be applied to spatial structures, but they suffer
several drawbacks. For one, they are not fully spatial, in
the sense that their calculation requires one to discard 
spatial information. Second, they do not directly speak 
to the correlation present in a system; rather they are 
more measures of entropy, disorder, and inhomogeneity. 

Other recent general approaches to pattern in two
dimensions include the extension of the formal theory of
computation and an information-theoretic approach
somewhat similar in spirit to that which we develop
below.

In this work, we take a different approach to the 
question of pattern and structure in two spatial di- 
mensions. Our starting point is the excess entropy, 
an information-theoretic measure of complexity that is 
commonly used and well understood in one dimension
[29]. Our main goals
are severalfold. First, we introduce three ways to extend
the definition of excess entropy to more than one dimen- 
sion, noting that these extensions are not equivalent. Sec- 
ond, we report results of estimating two of these forms 
of excess entropy for a standard statistical mechanical 
system: the two-dimensional Ising model with nearest- 
and next-nearest-neighbor interactions on a square lattice.
We show that these two forms of excess entropy
are similar but not identical, that each is sensitive to the
structural changes the system undergoes, and that they 
are able to distinguish between different patterns that 
have the same structure factors. Third, we discuss some 
of the subtleties and challenges associated with moving 
from a one- to a two-dimensional information-theoretic 
analysis of pattern and structure. 



II. ENTROPY AND ENTROPY
CONVERGENCE IN ONE DIMENSION

We begin by reviewing information-theoretic quantities
applied to one-dimensional (1D) systems. This allows us
to define quantities and to fix notation that will be useful
in our discussion of two-dimensional (2D) information
theory in the subsequent section.

Let X be a random variable that assumes the values
$x \in \mathcal{X}$, where $\mathcal{X}$ is a finite set. We denote the probability
that X assumes the particular value x by Pr(x). Likewise,
let Y be a random variable that assumes the values
$y \in \mathcal{Y}$. The Shannon entropy of the random variable X
is defined by:

$$ H[X] = -\sum_{x \in \mathcal{X}} \Pr(x) \log_2 \Pr(x) \tag{1} $$

The entropy H[X] measures the average uncertainty, in
units of bits, associated with outcomes of X. The
conditional entropy is defined by

$$ H[X|Y] \equiv -\sum_{x \in \mathcal{X},\, y \in \mathcal{Y}} \Pr(x, y) \log_2 \Pr(x|y) \tag{2} $$

and measures the average uncertainty associated with
variable X, if we know the outcome of Y. Finally, the
mutual information between X and Y is defined as

$$ I[X;Y] = H[X] - H[X|Y] \tag{3} $$

Thus, Y carries information about X to the extent that
knowledge of Y reduces one's average uncertainty about
X. The above three definitions are all standard; for
details, see, e.g., Ref. [...].

A. Block Entropy and Entropy Density

Now consider a 1D chain $\ldots S_{-2} S_{-1} S_0 S_1 \ldots$ of random
variables $S_i$ that range over a finite set $\mathcal{A}$. This chain may
be viewed as a 1D spin system, a stationary time series
of measurements, or an orbit of a symbolic dynamical
system. We denote a block of L consecutive variables
by $S^L = S_1 \ldots S_L$. The probability that the particular
L-block $s^L$ occurs is denoted $\Pr(s^L)$. We shall follow
the convention that a capital letter refers to a random
variable, while a lowercase letter denotes a particular
value of that variable.

We now examine the behavior of the Shannon entropy
H(L) of $S^L$. The total Shannon entropy of length-L
sequences, the block entropy, is defined by

$$ H(L) \equiv -\sum_{s^L} \Pr(s^L) \log_2 \Pr(s^L) \tag{4} $$

Graphically, we represent this as

$$ H(L) = H[\, \text{block of } L \text{ adjacent spins} \,] \tag{5} $$

The sum in Eq. (4) is understood to run over all possible
blocks of L consecutive symbols. The entropy density is
then defined as

$$ h_\mu \equiv \lim_{L \to \infty} \frac{H(L)}{L} \tag{6} $$



The above limit exists for all spatial-translation invariant
systems. Eqs. (4) and (6), together, are equivalent
to the Gibbs entropy density. However, the
information-theoretic vantage point allows us to form another
expression for the entropy density, one that will lead to a
measure of structure.

The entropy density $h_\mu$ can be re-expressed as the limit
of a form of conditional entropy. To do so, we first define

$$ h_\mu(L) \equiv H[S_L | S_{L-1} S_{L-2} \cdots S_1] \tag{7} $$



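In words, $h_\mu(L)$ is the entropy of a single spin conditioned
on a block of $L-1$ adjacent spins. This can also be written
graphically:

$$ h_\mu(L) = H[\, \text{target spin} \,\|\, \text{block of } L-1 \text{ adjacent spins} \,] \tag{8} $$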



The pictogram on the right indicates that the entropy is
conditioned on the $L-1$ spins directly to the right of the
single target spin, with the bold vertical lines denoting
the boundary where the target spin and spin block abut.
One can then show that the entropy density defined in
Eq. (6) can be written as:

$$ h_\mu = \lim_{L \to \infty} h_\mu(L) \tag{9} $$



For a proof that the limits in Eqs. (6) and (9) are equivalent,
see Ref. [...]. As the block length L grows, the terms
in Eq. (9) typically converge to $h_\mu$ much faster than those
in Eq. (6). See, e.g., Ref. [...] and citations therein.
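
The difference in convergence rates can be seen on a toy example. The sketch below is illustrative only: it assumes a plug-in (frequency-count) estimate of the block entropy, treats the sequence as periodic, and uses the chain-rule identity $h_\mu(L) = H(L) - H(L-1)$ to evaluate the conditional form. The function names and the period-2 test sequence are hypothetical choices, not taken from the text.

```python
from collections import Counter
from math import log2


def block_entropy(seq, L):
    """Plug-in estimate of the block entropy H(L) in bits (periodic sequence)."""
    if L == 0:
        return 0.0
    n = len(seq)
    # Count every length-L window, wrapping around the end of the sequence.
    counts = Counter(tuple(seq[(i + k) % n] for k in range(L)) for i in range(n))
    # H = sum_p p * log2(1/p), written so that a certain block gives exactly 0.0.
    return sum((c / n) * log2(n / c) for c in counts.values())


def density_estimates(seq, L):
    """Return (H(L)/L, H(L) - H(L-1)): the two finite-L estimates of h_mu."""
    ratio = block_entropy(seq, L) / L
    conditional = block_entropy(seq, L) - block_entropy(seq, L - 1)
    return ratio, conditional


period2 = [0, 1] * 8  # hypothetical test sequence whose entropy density is zero
for L in range(1, 5):
    ratio, conditional = density_estimates(period2, L)
    print(L, ratio, conditional)
```

For this sequence the conditional estimate reaches the asymptotic value 0 at L = 2, whereas H(L)/L decays only as 1/L.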



B. Excess Entropy 



The entropy density measures the randomness or
unpredictability of the system; $h_\mu$ is the randomness that
persists even after correlations over infinitely long blocks
of variables are taken into account. A complementary
quantity to the entropy density is the excess entropy E
[..., 26, 27, ...]. The excess entropy
may be viewed as a measure of the apparent memory or
structure in the system.

The excess entropy is defined by considering how the
finite-L entropy density estimates $h_\mu(L)$ converge to their
asymptotic value $h_\mu$. For each L, the system appears
more random than it actually is by an amount $h_\mu(L) - h_\mu$.
Summing up these entropy-density overestimates gives us
the excess entropy:

$$ E_c \equiv \sum_{L=1}^{\infty} \left[ h_\mu(L) - h_\mu \right] \tag{10} $$



The excess entropy thus measures the amount of apparent
randomness at small L values that is "explained away"
by considering correlations over larger and larger blocks.
The subscript in Ec indicates that this form of excess
entropy is defined by considering how the entropy density
converges to $h_\mu$.

Another expression for the excess entropy is obtained
by looking at the growth of the block entropy H(L). By
Eq. (6), we know that H(L) typically grows linearly for
large L. The excess entropy can be shown to be equal to
the portion of H(L) that is sublinear: E is the subextensive
part. That is, the excess entropy is defined implicitly
by:

$$ H(L) \sim E_s + h_\mu L \quad \text{as } L \to \infty \tag{11} $$



Here, the subscript "S" on Es serves as a reminder that
this expression for the excess entropy is the subextensive
part of H(L).
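
The link between the convergence and subextensive forms can be sketched in a few lines. The derivation below assumes only the chain rule for Shannon entropy, which gives $h_\mu(L) = H(L) - H(L-1)$ for the conditional entropy of Eq. (7), together with the convention $H(0) = 0$; the cutoff M is an auxiliary symbol introduced here.

```latex
\begin{align}
\sum_{L=1}^{M} \left[ h_\mu(L) - h_\mu \right]
  &= \sum_{L=1}^{M} \left[ H(L) - H(L-1) \right] - M h_\mu \\
  &= H(M) - H(0) - M h_\mu \\
  &= H(M) - h_\mu M .
\end{align}
```

As $M \to \infty$ the left-hand side tends to the sum in Eq. (10), while the right-hand side is the part of the block entropy left over after the linear term in Eq. (11) is removed.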

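Finally, one can show that the excess entropy
is also equal to the mutual information between two
adjacent semi-infinite blocks of variables:

$$ E_i \equiv \lim_{L \to \infty} I[S_{-L} \cdots S_{-2} S_{-1} ; S_0 S_1 \cdots S_{L-1}] \tag{12} $$

$$ \phantom{E_i} = \lim_{L \to \infty} I[\, \text{block of } L \text{ spins} \,\|\, \text{block of } L \text{ spins} \,] \tag{13} $$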



The "I" in the subscript indicates that this expression for
the excess entropy is given in terms of a mutual information.
Note that in the pictographic version, Eq. (13), the
two semi-infinite blocks are understood to be adjacent,
as indicated by the thick vertical lines.

The three different forms for the excess entropy (Ec,
Es, and Ei) given above are all equivalent in one
dimension. We represent these different forms with
distinct symbols because they are not identical in two
dimensions.
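
As a small numerical check of this equivalence in 1D, the sketch below evaluates the convergence form and the mutual-information form on two periodic test sequences. It is an illustration under stated assumptions: entropies are plug-in estimates on a sequence treated as periodic, the infinite sum of Eq. (10) is truncated at a hypothetical cutoff `L_max` (with $h_\mu$ approximated by the last estimate), and the limit in Eq. (12) is replaced by a finite block length.

```python
from collections import Counter
from math import log2


def block_entropy(seq, L):
    """Plug-in estimate of H(L) in bits for a sequence treated as periodic."""
    if L == 0:
        return 0.0
    n = len(seq)
    counts = Counter(tuple(seq[(i + k) % n] for k in range(L)) for i in range(n))
    return sum((c / n) * log2(n / c) for c in counts.values())


def excess_entropy_convergence(seq, L_max):
    """Truncated Eq. (10); h_mu is approximated by the estimate at L_max."""
    h = [block_entropy(seq, L) - block_entropy(seq, L - 1) for L in range(1, L_max + 1)]
    return sum(h_L - h[-1] for h_L in h)


def excess_entropy_mutual_info(seq, L):
    """Finite-L Eq. (12): I[A;B] = H[A] + H[B] - H[A,B] = 2 H(L) - H(2L)."""
    return 2 * block_entropy(seq, L) - block_entropy(seq, 2 * L)


for name, seq in [("period 2", [0, 1] * 8), ("period 4", [0, 0, 1, 1] * 4)]:
    print(name, excess_entropy_convergence(seq, 6), excess_entropy_mutual_info(seq, 3))
```

Both forms give 1 bit for the period-2 sequence and 2 bits for the period-4 sequence, that is, the logarithm of the period.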

In the subsequent section we compare our results for
the excess entropies with various structure factors,
standard quantities from statistical physics used to detect
periodic structure. The definition of the structure factor
begins with the two-spin correlation function:

$$ \Gamma_{ij} \equiv \langle (S_i - \langle S_i \rangle)(S_j - \langle S_j \rangle) \rangle \tag{14} $$

$$ \phantom{\Gamma_{ij}} = \langle S_i S_j \rangle - \langle S \rangle^2 \tag{15} $$



where $S_i$ and $S_j$ denote the values of spins at different
lattice coordinates. The second equality follows from
the translation invariance of configurations, which gives
$\langle S_i \rangle = \langle S_j \rangle = \langle S \rangle$. The angular
brackets indicate a thermal expectation value. In 2D we
will be interested in spins that are separated horizontally
or vertically, but not both. (In a scattering scenario,
this corresponds to restricting ourselves to a situation in
which the particles to be scattered are incident along a
line parallel to one of the axes of the lattice.) We
define $\Gamma(r)$ as the correlation function between two spins
separated, horizontally or vertically, by r lattice sites:

$$ \Gamma(r) \equiv \langle S_0 S_r \rangle - \langle S \rangle^2 \tag{16} $$



The structure factor, then, is the discrete Fourier
transform of the correlation functions:

$$ S(p) \equiv \sum_{r} \cos\!\left( \frac{2 \pi r}{p} \right) \Gamma(r) \tag{17} $$



If the correlation function has a strong period-p
component, then S(p) is large; if not, S(p) is small. The
absolute magnitude of S{p) is generally not interpreted; 
only the relative change as a function of p is. In this way, 
the structure factor serves as a signal of correlations in a 
configuration at a given periodicity. 
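
A minimal sketch of how a structure factor singles out one period is given below. Several choices here are assumptions rather than details from the text: the transform is taken with a cosine kernel, the sum runs over separations r = 1, ..., `r_max`, the thermal average is replaced by an average over sites of a single periodic 1D chain, and the function names are hypothetical.

```python
from math import cos, pi


def correlation(spins, r):
    """Gamma(r) = <S_0 S_r> - <S>^2, averaged over the sites of a periodic chain."""
    n = len(spins)
    mean = sum(spins) / n
    pair = sum(spins[i] * spins[(i + r) % n] for i in range(n)) / n
    return pair - mean * mean


def structure_factor(spins, p, r_max):
    """Assumed form: S(p) = sum over r = 1..r_max of cos(2 pi r / p) * Gamma(r)."""
    return sum(cos(2 * pi * r / p) * correlation(spins, r) for r in range(1, r_max + 1))


alternating = [1, -1] * 8  # period-2 (antiferromagnetic) chain
for p in (1, 2, 4):
    print(p, structure_factor(alternating, p, r_max=8))
```

For the alternating chain only the period-2 structure factor is large; the period-1 and period-4 values vanish up to floating-point rounding.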

It is widely held that the excess entropy E serves as a
general purpose measure of a system's structure, regularity,
or memory; for recent reviews, see Refs. [...]. The
excess entropy provides a quantitative measure of structure
that may be applied to any 1D symbolic string. In
Refs. [...], we argued that E may be viewed as an
effective order parameter for 1D spin systems. In particular,
we showed that the excess entropy is sensitive to
periodic structure at any period, whereas structure factors,
by construction, are sensitive to ordering at only a
single spatial period. We shall return to this point below
and show that the same general claim holds in two
dimensions as well.



III. TWO-DIMENSIONAL ENTROPY, 
ENTROPY DENSITY, AND EXCESS ENTROPY 

A. Generalizing to Higher Spatial Dimensions 

Below we discuss how to extend the 1D analysis outlined
above to apply to spatial patterns in two and higher
dimensions. Before launching into definitions and
formalism, we sketch some of the philosophy and intuitions
that motivate the path we take and highlight some of
the general issues that arise as one moves from 1D to 2D
systems.

Patterns in two dimensions are fundamentally different
from those in one dimension. For example, in one
dimension a natural way to scan a configuration exists:
left-to-right, say. That is, each local variable is indexed
in a well defined order. (The information-theoretic
measures discussed in the previous section have the same
values regardless of whether the 1D configuration is scanned
left-to-right or right-to-left.)

The 1D approach simply does not generalize to 2D in
such a unique, natural way. One might be tempted to
scan or parse a 2D configuration by taking a particular
1D path through it. One would then apply 1D measures
of randomness and structure to the sequences thus
obtained. For example, in Refs. [35, ...], a space-filling
curve is used to parse a 2D configuration and, from this,
the entropy density of the configuration is estimated.

While the 1D-path method does yield the correct
entropy density, it is also clear that it projects additional,
spurious structure onto the configuration. By snaking
through the lattice, it is inevitable that sites adjacent
in the 2D lattice occur far apart in the 1D sequence.
As a result, long-range correlations appear in the latter.
Thus, a 1D excess entropy (or any other 1D measure
of structural complexity) adapted in this way will
capture not only properties of the 2D configuration, but also
properties of the path. Except in special cases and with
appropriate prior knowledge, it does not appear possible
to disentangle these two distinct sources of apparent
structure. These and related difficulties with the 1D
approach have been discussed in some detail in, for example,
Refs. [...].
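
The separation that a 1D path introduces between lattice neighbors is easy to quantify for a simple case. The sketch below uses a boustrophedon ("snake") row scan as a stand-in; this path is an assumption chosen for simplicity and is not necessarily the space-filling curve used in the references above. The function names are hypothetical.

```python
def snake_index(row, col, width):
    """Position of lattice site (row, col) along a snake path over rows of given width."""
    # Even rows are read left-to-right, odd rows right-to-left.
    return row * width + (col if row % 2 == 0 else width - 1 - col)


def vertical_neighbor_separations(width):
    """1D distances between each site (0, c) and its lattice neighbor (1, c)."""
    return [abs(snake_index(1, c, width) - snake_index(0, c, width)) for c in range(width)]


print(vertical_neighbor_separations(8))
```

Sites that are nearest neighbors on the lattice end up as far as 2 × width − 1 steps apart in the 1D sequence, so a nearest-neighbor coupling appears to the 1D measure as a correlation whose range grows with the lattice width.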

Here, we seek an alternative to understanding a 2D
pattern by parsing it into 1D strings. We are immediately
faced with a problem, however. There is a unique,
complete ordering of the connected, nested subsets of a 1D
lattice such that the conditional entropies of the target
spin, conditioned on this sequence of subsets, are
monotonically decreasing. It is this ordering that makes Eq. (10)
unambiguous and unique in 1D. In contrast, connected,
nested subsets of a 2D lattice that have this monotonic
property are not unique. This is a direct consequence of
the topological differences between one- and two-dimensional
lattices. We shall see that this lack of uniqueness
introduces ambiguity in extending Eq. (10) to two
dimensions; specifically, there is no natural, unique expression
for the excess entropy in two dimensions.

This lack of uniqueness is not a cause for concern. In 
fact, it seems a desirable property. Given the richness 
and subtleties of 2D patterns, one would expect that it 
would take more than one (or even several) complexity 
measures to adequately capture the range of 2D struc- 
tures and orderings. These different measures will cap- 
ture different features of the 2D configuration. As such, it 
is particularly important to specify the context in which 
a complexity measure is to be used and state what the 
measure is intended to capture, as we and others have
argued elsewhere.

As an example of this non-uniqueness in 2D, consider
what occurs when one moves from calculus of one variable
to multi-dimensional calculus. In 1D calculus, the
derivative is well defined for all smooth curves; the derivative is
simply a number. In contrast, in 2D the derivative is not
unique at each point on a surface; one must also specify
the direction in which it is taken. There is a subspace (the
tangent plane) of first derivatives of a smooth surface at
any single point. A similar scenario appears to hold for
the excess entropy in two dimensions. In Ref. [29] we
synthesized a number of information-theoretic approaches to
structure in one dimension by developing an analysis in
terms of discrete derivatives and integrals. We expect
that similar (although not unique) measures of structure,
randomness, and memory can be developed for 2D
systems by making use of discrete calculus in two
dimensions. The work presented below is a first step in this
direction.



B. Entropy Density 

The entropy density in two dimensions is defined in
the natural way. Consider an infinite 2D square lattice of
random variables $S_{ij}$ whose values range over the finite
set $\mathcal{A}$. Assuming that the variables are translationally
invariant, the 2D entropy density is given by:

$$ h_\mu \equiv \lim_{M, N \to \infty} \frac{H(M, N)}{M N} \tag{18} $$

where H(M, N) is the Shannon entropy of an M × N
block of spin variables. This limit exists for a
translationally invariant system, provided that the limits are
taken in such a manner that the ratio N/M remains
constant and finite.

Is there a way to re-express the 2D entropy density
of Eq. (18) as the entropy of a target variable
conditioned on a block of neighboring variables, analogous to
the 1D conditional form of Eqs. (7) and (9)? This question was,
to our knowledge, first answered in the affirmative by
Alexandrowicz in the early 1970s. Meirovitch and later
Schlijper and co-authors [44, ...] extended and applied
Alexandrowicz's work. These methods have also been discovered
independently by Eriksson and Lindgren and by Olbrich et
al. [48]. Here we briefly summarize the central result and
adapt it to our needs.

FIG. 1: Neighborhood templates for 2D conditional entropies.
The target spin is denoted with an X.

The most general approach to the conditional entropy
in two dimensions proceeds as follows. Let $h_\mu(M)$ denote
the Shannon entropy of the target spin conditioned on a
2D neighborhood template of $2M(M+1)$ spins. Arrange
the spin template in an $(M+1) \times (2M+1)$ rectangle, with
the target spin in the center of the rectangle's top row
and with the top, rightmost M spins deleted from the
template. A sequence of neighborhood templates of this
type is shown in Fig. 1. For example, $h_\mu(3)$ is the entropy
of the target spin (denoted by an X) conditioned on all
the other spins in the rightmost template of Fig. 1. The
2D entropy density may then be shown to be equal to

$$ h_\mu = \lim_{M \to \infty} h_\mu(M) \tag{19} $$
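
The template geometry just described can be enumerated explicitly, which also confirms the count of $2M(M+1)$ conditioning spins. The sketch below is illustrative: the function name, the (row, column) offset convention, and the choice that the M full rows lie at positive row offsets are assumptions made here.

```python
def neighborhood_template(M):
    """Offsets (d_row, d_col) of the conditioning spins; the target sits at (0, 0).

    The enclosing rectangle has M + 1 rows and 2M + 1 columns, with the target
    in the center of the top row (row offset 0).
    """
    # Top row: the M spins to the right of the target are deleted and the
    # target itself is not a conditioning spin, leaving the M spins to its left.
    offsets = [(0, d_col) for d_col in range(-M, 0)]
    # Remaining M rows: all 2M + 1 columns are kept.
    offsets += [(d_row, d_col) for d_row in range(1, M + 1) for d_col in range(-M, M + 1)]
    return offsets


for M in (1, 2, 3):
    print(M, len(neighborhood_template(M)), 2 * M * (M + 1))
```

The two printed counts agree for each M, since M + M(2M + 1) = 2M(M + 1).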



If it is known that the interactions between spins are of
finite range, then one only needs to use a shape as thick
as the interaction range. For example, in
the following section we consider a 2D Ising model with
nearest- and next-nearest-neighbor interactions. In this
case, one uses a strip with a thickness of two lattice sites;
see Fig. 2.

FIG. 2: Target spin (X) and neighborhood templates for
conditional entropies used in our study of the 2D NNN Ising
model. The cell numbers indicate the order in which the sites
are added to the template. For more discussion, see text.

We now slightly modify the definition of the template-size
parameter M in the conditional single-site entropy
$h_\mu(M)$ so as to apply to the scenario in Fig. 2. The
cell numbers in this figure indicate the order in which
individual sites are added to the neighborhood template.
For example, $h_\mu(3)$ now will denote the entropy of the
target spin ($S_{0,0}$) conditioned on the three spins labeled
1 ($S_{-1,0}$), 2 ($S_{0,1}$), and 3 ($S_{-1,1}$); that is,

$$ h_\mu(3) = H[S_{0,0} | S_{-1,0}, S_{0,1}, S_{-1,1}] \tag{20} $$



In the $M \to \infty$ limit, the new $h_\mu(M)$ still goes to the
entropy density, as in Eq. (19). It is also not hard to see
that this convergence must be monotonic:

$$ h_\mu(M) \leq h_\mu(M') , \quad M > M' \tag{21} $$



This is a direct consequence of the fact that conditioning
reduces entropy; that is, the conditional entropy of
a variable cannot increase as a result of increasing the 
number of variables upon which it is conditioned. 
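
The monotonic behavior can be observed directly on a small lattice. The sketch below is a plug-in estimate under stated assumptions: probabilities are site frequencies on a single configuration with periodic boundaries, the template is a list of hypothetical (d_row, d_col) offsets, and the three offsets used are only assumed to correspond to spins 1 to 3 of Eq. (20).

```python
from collections import Counter
from math import log2


def conditional_entropy(config, template):
    """Plug-in estimate of H[target | template] in bits on a periodic 2D array."""
    rows, cols = len(config), len(config[0])
    joint, context = Counter(), Counter()
    for r in range(rows):
        for c in range(cols):
            ctx = tuple(config[(r + dr) % rows][(c + dc) % cols] for dr, dc in template)
            joint[(config[r][c], ctx)] += 1
            context[ctx] += 1
    n = rows * cols
    # H[X|Y] = sum Pr(x, y) * log2( Pr(y) / Pr(x, y) )
    return sum((k / n) * log2(context[ctx] / k) for (_, ctx), k in joint.items())


checkerboard = [[1 if (r + c) % 2 == 0 else -1 for c in range(4)] for r in range(4)]
template = [(0, -1), (1, 0), (1, -1)]  # assumed offsets for spins 1, 2, 3
for m in range(len(template) + 1):
    print(m, conditional_entropy(checkerboard, template[:m]))
```

For the checkerboard the unconditioned entropy is 1 bit and drops to 0 as soon as one neighbor is known; adding further spins leaves it at 0, consistent with Eq. (21).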

A few remarks about the neighborhood template in
Fig. 2 are in order. First, the strip needs to be two sites
thick since the system explored below has interactions
that extend across two lattice sites. In this case, a strip
with a thickness of two sites shields one half of the
lattice from the other. In the limit that the strip is infinitely
long in the horizontal direction, the probability
distribution of the target spin is independent of the values
of the spins beneath the strip.

Second, at first blush, the numbering scheme in Fig. 2
appears ambiguous. Spins are added to the template in
order of increasing Euclidean distance from the target
spin. For example, spin 10 is a Euclidean distance $2\sqrt{2}$
from the center spin, whereas spin 11 is a distance of 3.
Since $2\sqrt{2} < 3$, one adds on spin 10 before 11. When
there is a tie, one adds the leftmost spin. For example,
spins 3 and 4 are the same Euclidean distance from the
center spin; spin 3 comes before 4 since it is to the left.

Of course, one can use alternative ordering schemes, 
such as adding spins in a widening spiral or some other 
geometric pattern. These choices do not change the result
in Eq. (19), since this is a statement about what happens
in the limit that an arbitrarily large number of spins have
been added to the template. However, looking ahead, the
order in which spins are added can affect the convergence
form of the 2D excess entropy, the 2D analog of Ec of
Eq. (10).

As noted above, the ambiguity in how the neighborhood
template of conditioning variables grows is a direct
result of the fact that a 2D lattice does not specify a strict
ordering of its elements in the way that a 1D sequence
does. Rather, a 2D lattice specifies a partial ordering
of its elements. Thus, there will always be "ties" in the
sense just mentioned, and so there is no unique, natural
way to add on the spins one-by-one based on an ordering
of subsets of spin blocks. See Ref. [...] for a detailed
discussion of this, albeit in a slightly different context.

Third, there is a physical motivation for the neighborhood
template of Fig. 2, articulated by Kikuchi.
Picture a crystal growing by adsorbing one particle at
a time. One can imagine that particles are added
one-by-one, left to right, on top of already formed layers of
the solid. This is exactly the process captured by the
templates of Figs. 1 and 2.

As remarked above, the conditional Shannon entropy
method for calculating the entropy density $h_\mu$ is well
known and has been successfully applied to a number
of different systems. For example, in Ref. [...] Schlijper
and Smit form upper and lower bounds for the
entropy using block probabilities. They combine these
bounds to obtain impressively accurate results for the
entropy of the 2D Ising model and the q = 5, 2D Potts
model. This method for calculating the entropy has also
been applied to the Ising model on a simple cubic
lattice, a 2D hard-square lattice gas, the
three-dimensional fcc Ising antiferromagnet [54], coupled map
lattices [48], Gaussian random fields [55], polymer chain
models [56], and network-forming materials. Quite
recently, Meirovitch estimated the entropy for the
2D Ising ferromagnet. Remarkably, his results have only
a 0.01% relative error at the critical temperature, where
one might expect the conditional entropy form to
overestimate the entropy density due to long-range correlations
missed by finite-size templates.



C. Excess Entropy in Two Dimensions 

We now turn to the question of how to extend excess
entropy to more than one dimension. In Sec. II B we
saw that there were three different forms for the excess
entropy: Ec, obtained by looking at how the entropy
density converges to its asymptotic value; Ei, the excess
entropy defined via a block-to-block mutual information;
and Es, the excess entropy as the subextensive part of
the total entropy H(L). In this section we consider three
possible approaches to excess entropy in two dimensions.
For each, we begin with one of the three different forms
for the 1D excess entropy.

First, consider the convergence excess entropy Ec, as
defined in Eq. (10). In the previous section we defined
a sequence of 2D entropy density estimates $h_\mu(M)$ that
converges from above to the entropy density $h_\mu$. We can
sum these entropy density over-estimates to obtain the
2D convergence excess entropy:

$$ E_c \equiv \sum_{M} \left[ h_\mu(M) - h_\mu \right] \tag{22} $$



We shall see that this form of the excess entropy is, like its
1D cousin, capable of capturing the structures or
correlations present in a 2D system. Note that this definition
can depend on the order in which spins are added on to
the template and, as discussed in the previous section,
there is no unique ordering to use to determine the
sequence in which to add sites. Nevertheless, our
investigations have shown that any reasonable choice for ordering
yields an Ec that behaves qualitatively the same as that
defined in Eq. (22).

The mutual information form Ei of the excess entropy,
defined in Eq. (13), can naturally be extended by
considering the mutual information between two adjacent,
infinite half-planes:

$$ E_i \equiv \lim_{M, N \to \infty} I[\, \text{$M \times N$ block of spins} \,\|\, \text{$M \times N$ block of spins} \,] \tag{23} $$



As in Eq. (13), it is understood that the two semi-infinite
planes are adjacent.
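
A finite-block version of this mutual information can be estimated from block frequencies. The sketch below is illustrative and rests on several assumptions: plug-in probabilities from a single configuration with periodic boundaries, two blocks placed side by side (the text does not fix the orientation), and the identity I[A;B] = H[A] + H[B] − H[A,B] with H[A] = H[B] by translation invariance. The function names are hypothetical.

```python
from collections import Counter
from math import log2


def block_entropy_2d(config, height, width):
    """Plug-in Shannon entropy (bits) of height x width blocks on a periodic lattice."""
    rows, cols = len(config), len(config[0])
    counts = Counter(
        tuple(config[(r + dr) % rows][(c + dc) % cols]
              for dr in range(height) for dc in range(width))
        for r in range(rows) for c in range(cols)
    )
    n = rows * cols
    return sum((k / n) * log2(n / k) for k in counts.values())


def block_mutual_information(config, height, width):
    """I[A;B] for two side-by-side height x width blocks A and B."""
    single = block_entropy_2d(config, height, width)
    joined = block_entropy_2d(config, height, 2 * width)  # A and B together
    return 2 * single - joined


checkerboard = [[1 if (r + c) % 2 == 0 else -1 for c in range(8)] for r in range(8)]
aligned = [[1] * 8 for _ in range(8)]
print(block_mutual_information(checkerboard, 2, 4))
print(block_mutual_information(aligned, 2, 4))
```

The checkerboard yields 1 bit, the information needed to fix which of its two spatial phases a block is in, while the fully aligned configuration yields 0 bits.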

Finally, one may also develop an expression for
2D subextensive excess entropies by considering how
H(M, N) grows with M and N. In analogy to Eq. (11),
we define three subextensive excess entropies via:

$$ H(M, N) = H[\, \text{$M \times N$ block of spins} \,] \tag{24} $$

$$ \phantom{H(M, N)} \sim E_s + E_s^M M + E_s^N N + h_\mu M N \quad \text{as } M, N \to \infty \tag{25} $$

Note that in an isotropic system, such as that considered
below, $E_s^M = E_s^N$. We shall not consider these forms for
the 2D excess entropy here, opting instead to focus on
Ec and Ei.

IV. RESULTS

A. Next-Nearest-Neighbor Ising Systems

To test the behavior of the different forms of the excess
entropy, we estimated Ei and Ec numerically for a standard
system: the 2D spin-1/2 Ising model with nearest-neighbor
(NN) and next-nearest-neighbor (NNN) interactions.
We choose this system since it is rich enough to
exhibit several distinct structures and due to its broad
familiarity. Its Hamiltonian $\mathcal{H}$ is given by:

$$ \mathcal{H} \equiv -J_1 \sum_{\langle ij, kl \rangle_{nn}} S_{ij} S_{kl} - J_2 \sum_{\langle ij, kl \rangle_{nnn}} S_{ij} S_{kl} - B \sum_{ij} S_{ij} \tag{26} $$

where the first (second) sum is understood to run over
all NN (NNN) pairs of spins. Each spin $S_{ij}$ is a binary
variable: $S_{ij} \in \{-1, +1\}$. The lattice consists of N × N
spins; the spatial indices on spin variables run from 0 to
N − 1.

We estimated the structure factors S(1), S(2), and
S(4) with Eq. (17) by directly measuring the frequency
of occurrence of $s_i s_j$ and $s$ in spin configurations
generated by a Monte Carlo simulation that used a standard
single-site Metropolis algorithm on a lattice with
periodic boundary conditions. That is, we sampled
configurations with the canonical distribution: a configuration's
probability is proportional to $e^{-\mathcal{H}(c)/T}$, where $\mathcal{H}(c)$ is the
energy of the configuration c and T is the temperature.
We used a lattice of 48 × 48 spins. Since we are not
interested here in extracting the system's critical properties,
there is no need to go to larger system sizes.

We estimated Ec and Ei from block probabilities by
observing the frequency of spin-block occurrences. To
estimate Ec we used a template containing fifteen total
spins, as shown in Fig. 2, and marginals of this
distribution for smaller template sizes. To estimate Ei we
calculated the mutual information of two adjacent 2 × 4 spin
blocks. For each J1 value we ran our Monte Carlo
simulation for up to 2 × 10^? Monte Carlo timesteps (2 × 10^?
for J1 < −1.5) and then took data every 20 timesteps for
2 × 10^? timesteps. One Monte Carlo timestep corresponds
to trying to flip, on average, each spin in the lattice one
time. We thus sampled approximately 2 × 10^? template
configurations. For comparison, note that there are at
most (in the highly disordered regime) $2^{15} \approx 3 \times 10^4$
possible configurations in a template of 16 spins.
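
The simulation procedure can be sketched as follows. This is a minimal illustration of single-site Metropolis sampling for the Hamiltonian of Eq. (26) on a periodic N × N lattice, not the code used for the reported results; the function names, the random choice of flip sites, and the small example lattice are assumptions made here.

```python
import random
from math import exp


def local_field(spins, i, j, J1, J2, B):
    """Effective field on site (i, j): J1 * (sum of NN) + J2 * (sum of NNN) + B."""
    n = len(spins)
    up, down, left, right = (i - 1) % n, (i + 1) % n, (j - 1) % n, (j + 1) % n
    nn = spins[up][j] + spins[down][j] + spins[i][left] + spins[i][right]
    nnn = spins[up][left] + spins[up][right] + spins[down][left] + spins[down][right]
    return J1 * nn + J2 * nnn + B


def energy(spins, J1, J2, B):
    """Total energy of Eq. (26); every NN and NNN pair is counted once (N >= 3)."""
    n = len(spins)
    total = 0.0
    for i in range(n):
        for j in range(n):
            s, down, left, right = spins[i][j], (i + 1) % n, (j - 1) % n, (j + 1) % n
            nn = spins[i][right] + spins[down][j]
            nnn = spins[down][right] + spins[down][left]
            total -= s * (J1 * nn + J2 * nnn + B)
    return total


def metropolis_sweep(spins, J1, J2, B, T, rng):
    """One Monte Carlo timestep: N*N flip attempts at randomly chosen sites."""
    n = len(spins)
    for _ in range(n * n):
        i, j = rng.randrange(n), rng.randrange(n)
        dE = 2 * spins[i][j] * local_field(spins, i, j, J1, J2, B)
        # Accept with probability min(1, exp(-dE / T)).
        if dE <= 0 or rng.random() < exp(-dE / T):
            spins[i][j] = -spins[i][j]


rng = random.Random(0)
lattice = [[rng.choice((-1, 1)) for _ in range(8)] for _ in range(8)]
for _ in range(50):
    metropolis_sweep(lattice, J1=-4.0, J2=-1.0, B=0.0, T=1.0, rng=rng)
print(energy(lattice, -4.0, -1.0, 0.0))
```

Flipping a spin s in a local field h changes the energy by 2sh, which is why only the eight neighboring spins need to be examined per attempt.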



B. Excess Entropy Detects Periodic Structure 

Our results are shown in Fig. 3. The temperature was
held at T = 1.0, the external field at B = 0.0, and the
next-nearest-neighbor coupling at J2 = −1.0. Figure 3
shows S(1), S(2), S(4), Ei, and Ec as a function of J1 ∈
[−4.0, 4.0]. For all J1 values, the temperature is relatively
small compared to the average energy per spin, so
the configurations sampled are typically the ground state
with a few low-energy excitations.

As J1 is increased, the system moves through parameter
regimes in which there are significant correlations
of period 2, 4, and 1. This is seen, for example, in the
behavior of the various structure factors; the structure
factors selected correspond to periods of 2, 4, and 1
lattice sites.

FIG. 3: Structural changes in the 2D NNN Ising model
as a function of NN coupling J1 as revealed by structure
factors (a) S(1), (b) S(2), and (c) S(4), and excess entropies
(d) Ec (convergence) and (e) Ei (mutual information). The
temperature was fixed at T = 1.0 and J2 was held at −1.0 as
the NN coupling was swept from J1 = −4.0 to J1 = 4.0 in
steps of δJ1 = 0.01, except near the S(1) spike at J1 ≈ 2.5
where δJ1 = 0.005. We performed at least 5 different runs
at each J1 in the range |J1| < 1.15. Note the different scales
on the vertical axes: the excess entropies are measured in
bits of apparent memory; the structure factor magnitudes are
arbitrary. For more discussion, see text.

Physically, when J1 is large in magnitude and negative,
the tendency for nearest neighbors to anti-align
dominates and the system's ground state is antiferromagnetic:
a checkerboard pattern consisting of alternating
up and down spins. This pattern has a spatial period of
2. Not surprisingly, the period-2 structure factor S(2) in
Fig. 3(b) shows a strong signal in this low-J1 regime.

When J1 is near zero, the NN interactions are negligible
compared to the NNN interactions. Thus, each spin
orients opposite its four next-nearest neighbors, while
disregarding its four nearest neighbors. The result is that
the lattice effectively decouples into four noninteracting
sublattices. On each of these sublattices the spins
alternate in sign, resulting in a ground state with spatial
period 4. Note that the period-4 structure factor S(4) in
Fig. 3(c) has a large value near J1 = 0, indicating this
period-4 ordering.

As J1 is increased from 0, the tendency for the spins
to align grows stronger. Eventually this NN interaction
overwhelms the NNN interactions and the entire lattice
starts to align. This is the familiar paramagnet-
ferromagnet transition. Above J1 ≈ 2.5 the system acquires
a net magnetization; there is now an unequal number of up
and down spins, whereas below J1 ≈ 2.5 there are always,
on average, equal numbers of up and down spins. This
transition is signaled by the distinct spike in the
period-1 structure factor S(1) near J1 ≈ 2.5 and by the
vanishing of S(1) at larger J1. (The magnetic
susceptibility χ diverges at the critical point of a
ferromagnet-paramagnet transition. Since χ ∝ S(1), one
expects to see a spike in S(1) near this transition where
the system acquires a non-zero magnetization.)

In Figs. 3(d) and 3(e) we plot the mutual-information
excess entropy Ei and the convergence excess entropy
Ec versus J1 over the same parameter range. In the
large and negative J1 regime Ei = Ec = 1 bit, indicating
that there is one bit of information stored in the
configurations. The configurations have a simple structure
(alternating up-down spins) and the magnitude of E
gives the information needed to specify the spatial phase
of the period-2 configurations. When J1 is large and the
system undergoes the transition to ferromagnetic ordering,
Ei = Ec = 0, since the configurations consist of
all aligned spins, and there is no spatial information or
structure in them. In the intermediate regime (J1 ≈ 0),
Ei and Ec are markedly larger, indicating that the system
is more structured than elsewhere. We will return
shortly to discuss in detail what the values of Ei and Ec
mean.
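
To make the one-bit and zero-bit values concrete, the
following Python sketch computes the mutual information
between the left and right halves of a small tile. It is an
illustration under stated assumptions, not the estimator
used for the results above: the ensemble is taken to be
uniform over the spatial phases of a perfect ground state
on a 4 x 4 tile, the ferromagnet is assumed to have a single
magnetization direction, and all function names are
hypothetical.

```python
import math
from collections import Counter


def entropy_bits(samples):
    """Shannon entropy (bits) of the empirical distribution of samples."""
    counts = Counter(samples)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def mutual_information_bits(pairs):
    """I(L;R) = H(L) + H(R) - H(L,R) over equally weighted (L, R) pairs."""
    left = [l for l, _ in pairs]
    right = [r for _, r in pairs]
    return entropy_bits(left) + entropy_bits(right) - entropy_bits(pairs)


def halves(config):
    """Split a configuration (tuple of row tuples) into left/right halves."""
    n = len(config[0]) // 2
    left = tuple(row[:n] for row in config)
    right = tuple(row[n:] for row in config)
    return left, right


def period2_checkerboard(phase, n=4):
    """Antiferromagnetic checkerboard; phase 0 or 1 selects the sublattice."""
    return tuple(
        tuple(1 if (i + j + phase) % 2 == 0 else -1 for j in range(n))
        for i in range(n)
    )


def ferromagnet(n=4):
    """All spins up (assumes one magnetization direction is sampled)."""
    return tuple(tuple(1 for _ in range(n)) for _ in range(n))


# Two equally likely spatial phases of the period-2 pattern.
antiferro_pairs = [halves(period2_checkerboard(p)) for p in (0, 1)]
# A single fully aligned configuration.
ferro_pairs = [halves(ferromagnet())]
```

Under these assumptions `mutual_information_bits(antiferro_pairs)`
evaluates to 1 bit, since seeing one half fixes which of the
two phases the other half is in, and
`mutual_information_bits(ferro_pairs)` evaluates to 0 bits,
since there is nothing left to learn.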

Note that each excess entropy is sensitive to correlations
at all periodicities, despite the fact that each is
merely a single, unparameterized function. In contrast,
the structure factors S(p) are a one-parameter family of
functions that must be tuned a posteriori to find relevant
periodic structure. That is, the period-1 structure
factor S(1) detects only the period-1 correlations near
J1 = 2.5. Moreover, S(1) is unable to distinguish between
the period-2 and period-4 orderings at J1 < -3.0
and J1 ≈ 0, respectively; S(1) « 1 for both period-2 and
period-4 configurations.

Since the excess entropy E is a single, unparameterized
function sensitive to structure of any periodicity, it is a
more general measure of structure and correlation than
the structure factors S(p). Conversely, S(p) is somewhat
myopic. By considering only two-point correlations
modulated at a selected periodicity p, S(p) misses structure
that is either aperiodic or that is due to
more-than-two-spin correlations. In fact, E is even more
sensitive and general than these observations indicate.



C. E Distinguishes Structurally Distinct Ground 
States 

Looking closely at the mutual-information excess entropy
Ei near J1 = 0 in Fig. 3(d), one notices that the
curve splits in two in the |J1| < 1.0 region. This can be
seen more clearly in Fig. 4, in which we plot Ei versus
J1 in this region. We sampled the NN coupling J1 every
0.01 and we performed at least five different runs at each
J1 value. Sometimes Ei = 3.0 bits, whereas for other
trials Ei = 2.0 bits. Why are there two different values
for Ei on different runs? And why, in contrast, is the
period-4 structure factor S(4) the same for all runs?

The answer is simple: there are multiple structurally 
distinct ground states. The three possible ground-state 
configurations are shown in Fig. 5. Note that for each
ground state, all NNN pairs of sites have opposite spin 
values, thus minimizing the system's energy. Note also 
that each ground state is identical if one considers only a 
horizontal or vertical slice; the repeating pattern of two 
up spins followed by two down spins is the same. 
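
The two properties just noted can be checked directly. The
Python sketch below builds the three patterns on an 8 x 8
periodic tile and computes the two-point correlation along
rows and along columns. The explicit formulas for the
patterns, the reading of NNN as "two sites away along an
axis", the labels `stripe_a` and `stripe_b` (which diagonal
is "left" is arbitrary here), and the function names are all
assumptions made for illustration.

```python
H = (1, 1, -1, -1)  # one period of the "two up, two down" slice

# Assumed explicit forms of the three period-4 ground states.
GROUND_STATES = {
    "checkerboard": lambda i, j: H[i % 4] * H[j % 4],
    "stripe_a": lambda i, j: H[(i + j) % 4],
    "stripe_b": lambda i, j: H[(i - j) % 4],
}


def tile(rule, n=8):
    """n x n spin array generated by rule(i, j)."""
    return [[rule(i, j) for j in range(n)] for i in range(n)]


def row_correlation(s, r):
    """Average of s(i, j) * s(i, j + r) over the tile (periodic)."""
    n = len(s)
    return sum(s[i][j] * s[i][(j + r) % n]
               for i in range(n) for j in range(n)) / (n * n)


def column_correlation(s, r):
    """Average of s(i, j) * s(i + r, j) over the tile (periodic)."""
    n = len(s)
    return sum(s[i][j] * s[(i + r) % n][j]
               for i in range(n) for j in range(n)) / (n * n)


def nnn_all_opposite(s):
    """True if every spin is opposite to the spins two sites away."""
    n = len(s)
    return all(s[i][j] == -s[(i + 2) % n][j] and s[i][j] == -s[i][(j + 2) % n]
               for i in range(n) for j in range(n))


for name, rule in GROUND_STATES.items():
    s = tile(rule)
    rows = [row_correlation(s, r) for r in range(4)]
    cols = [column_correlation(s, r) for r in range(4)]
    print(name, nnn_all_opposite(s), rows, cols)
```

For all three patterns the row and column correlations come
out identical, so any quantity built only from two-spin
statistics along the axes cannot tell the patterns apart.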

After a long transient time, the system usually settles 
into one of these three states. A boundary defect between 
two different ground states has an energy cost associated 
with it. As such, most boundaries are eventually de- 
stroyed. Incidentally, the dynamics through which this 
removal of boundary defects occurs is rather subtle and 
can be very long-lived. For example, a boundary between 
left and right diagonal phases costs more than a bound- 
ary between the checkerboard and one of the striped 
patterns. As a result, when the two different striped 
phases come close, the checkerboard pattern emerges be- 
tween them, pushing the stripe boundaries away from 
each other. Moreover, as the temperature approaches 
zero, we observe that there are times when the ground 
state is simply not found via single-flip Metropolis Monte 
Carlo dynamics. Similar phenomena have been observed 
in other antiferromagnetic Ising models; for recent work, 
see Refs. g ||, |§. 

In any event, a straightforward calculation shows
that Ei = 3 bits for the checkerboard configuration of
Fig. 5(a), whereas Ei = 2 bits for the two striped phases.
(Similar calculations show that Ec = 3 bits for both the
checkerboard and striped ground states.) Note, however,
that S(4) is the same for all three ground states. By
construction, S(4) measures only two-spin statistics
obtained by considering correlations along a horizontal or
a vertical direction. And so, the three ground states are
the same if one considers only isolated horizontal or
vertical slices; every slice consists of a repeating
pattern of two up spins followed by two down spins. Of
course, one can adapt the definition of S(p) to account for
the diagonal striped phases, but this simply begs the
question of discovering the intrinsic patterns in the first
place.

FIG. 4: The mutual-information excess entropy Ei showing
the existence of multiple period-4 ground states.
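
One way to see where the values of 3 and 2 bits can come
from is to count spatial phases. The Python sketch below
counts the distinct translates of each pattern on a 4 x 4
periodic tile and takes the base-2 logarithm. Treating Ei of
a perfectly ordered pattern as the logarithm of its number
of spatial phases is an assumption here, motivated by the
period-2 case above where E is described as the information
needed to specify the spatial phase; it is not the
calculation referred to in the text, and it does not
reproduce the Ec values. The pattern formulas and function
names are likewise hypothetical.

```python
import math

H = (1, 1, -1, -1)  # one period of the "two up, two down" slice

# Assumed explicit forms of the three period-4 ground states.
RULES = {
    "checkerboard": lambda i, j: H[i % 4] * H[j % 4],
    "stripe_a": lambda i, j: H[(i + j) % 4],
    "stripe_b": lambda i, j: H[(i - j) % 4],
}


def tile(rule, n=4):
    """n x n pattern as a hashable tuple of row tuples."""
    return tuple(tuple(rule(i, j) for j in range(n)) for i in range(n))


def translate(t, di, dj):
    """Pattern shifted by (di, dj) with periodic wrap-around."""
    n = len(t)
    return tuple(
        tuple(t[(i + di) % n][(j + dj) % n] for j in range(n))
        for i in range(n)
    )


def spatial_phases(t):
    """Set of all distinct translates of the pattern."""
    n = len(t)
    return {translate(t, di, dj) for di in range(n) for dj in range(n)}


def phase_bits(t):
    """log2 of the number of distinct spatial phases."""
    return math.log2(len(spatial_phases(t)))


for name, rule in RULES.items():
    print(name, len(spatial_phases(tile(rule))), phase_bits(tile(rule)))
```

The checkerboard has 8 distinct translates and each stripe
pattern has 4, giving 3 and 2 bits respectively. Together
they account for all 16 ways of choosing one of two
alternating states on each of the four sublattices.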

Near |J1| = 1, notice that Ei and Ec occur in plateaus
between 2 and 3 bits and above. This indicates that 
the system has settled into a number of more structured 
metastable states consisting of mixtures of the three 
ground states. 

In summary, we see that the mutual-information excess
entropy Ei is capable of distinguishing between patterns
that are not distinct according to the structure factors
S(p). In fact, we initially did not anticipate the
two striped ground states, glibly assuming that the only 
ground state is the checkerboard. Our results for Ei, 
which we initially found confusing, led us to examine 
the configurations more closely and to detect the dis- 
tinct ground state structures. This, in turn, led us to no- 
tice the rich dynamics of the configurations as they wend 
their way towards one of the three ground states. In 
short, these structural subtleties would have been missed 
entirely had we relied solely on the structure factors. 




FIG. 5: The three ground states for J1 ≈ 0, J2 < 0: (a)
checkerboard, (b) left-diagonal stripe, and (c)
right-diagonal stripe.

V. DISCUSSION AND CONCLUSION 

We have introduced three extensions of the excess
entropy that apply to two-dimensional configurations.
Each excess entropy expression is based on a different
way of viewing the one-dimensional excess entropy: the
convergence excess entropy Ec measures the manner in
which finite-template entropy density estimates converge
to their asymptotic value; the subextensive excess
entropy Es is related to the subextensive forms of the
block entropy H(M, N); and the mutual-information excess
entropy Ei is defined as the mutual information between
two halves of a configuration.

Applying two of these measures, Ec and Ei, to the
NNN Ising model, we have seen that these quantities
capture the structural changes this system undergoes as its
parameters are varied. In contrast, each structure factor
is sensitive only to periodic ordering of one particular
period. Moreover, our results show that the
mutual-information excess entropy Ei cleanly distinguishes
the checkerboard from the striped period-4 ground states,
whereas the period-4 structure factor is simply incapable
of making such a structural distinction. Finally, the
values that the excess entropies take on are interpretable
and give a quantitative measure of the amount of structure
in the system.

The picture that emerges, then, is that the various two- 
dimensional excess entropies behave as expected; they 
are clearly general purpose measures of two-dimensional 
structure. The excess entropy, being sensitive to multi- 
spin correlations, is capable of capturing patterns that 
a particular structure factor misses. The excess entropy 
does not decompose a pattern into periodic components, 
reporting instead a measure of the total amount of ap- 
parent information in a system. 

The goal of this work is not to suggest that the ex- 
cess entropy replace structure factors or, more generally, 
Fourier analysis. We view the excess entropy not in com- 
petition with Fourier analysis, but complementary to it; 
the excess entropy is designed to answer a different set of 
questions than those addressed by Fourier components. 
For example, it has long been appreciated in dynamical 
systems that power spectral analysis is of little help in 
revealing the geometry of a chaotic attractor . Anal- 
ogously, spectral decomposition typically will say little 
about how difficult it is to learn or synchronize to a pat- 
tern. 

Clearly, however, there is much more work to be done 
to develop a thorough, well understood methodology for 
two-dimensional patterns. One possible approach builds 
on Refs. |4[ |2^, ^ which take a systematic look at en- 
tropy growth and convergence by using a discrete cal- 
culus. This work places several complexity measures 
within a common framework and leads to new measures 
of structure. From the study presented above, we con- 
clude that a similar analysis in two dimensions, using a 
two-dimensional discrete calculus, holds great promise. 



Another area for future research concerns developing 
relationships between measures of complexity of a pat- 
tern and the difficulty of learning or synchronizing to 
it. There has been recent work on this in one dimen- 
sion H ^. For example, in Refs. Q we showed that 
the transient information [ P9| , an information-theoretic 
quantity complementary to the excess entropy, measures 
the total uncertainty experienced by an observer who, 
given an accurate model of a process, must synchronize 
to it. Synchronization, in this sense, means determining 
with certainty in which internal state the process is. Es- 
tablishing a similar result in 2D would be a significant 
aid in understanding new aspects of higher-dimensional 
patterns. 

There are also, of course, a host of additional statistical 
mechanical systems, each with its own range of distinct 
structures, that should be similarly analyzed. Calculat- 
ing excess entropies for them will facilitate developing our 
understanding of the behavior of these different quanti- 
ties and may even lead to discovering novel structural 
properties. A natural choice is calculating the behavior 
of E near the critical temperature, extracting critical ex- 
ponents, and relating these exponents to others for the 
well studied nearest-neighbor Ising model. It will also be 
of interest to calculate the various excess entropy forms 
for noisy Sierpinski carpets and the like; this will allow
for direct comparison with calculations of the measures 
of inhomogeneity put forth in Refs. ||ri|, p^ . 

Ultimately, these different measures of structure — 
those presented here and those developed by other au- 
thors — will be judged not solely by their ability to 
shed light on existing, well understood model systems 
such as the NNN Ising model considered here. Instead, 
the broader concern is how to use these information- 
theoretic quantities to capture structure and patterns 
in systems that are less well understood. Equally im- 
portant is the question of establishing relationships be- 
tween information-theoretic measures of structural com- 
plexity and other quantities, including: physical mea- 
sures of structure and correlation; computation-theoretic 
properties; and the difficulty of learning a pattern. 